The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space
Abstract
A new distance measure between probability density functions (pdfs) is introduced, which we refer to as the Laplacian pdf distance. The Laplacian pdf distance exhibits a remarkable connection to Mercer kernel based learning theory via the Parzen window technique for density estimation. In a kernel feature space defined by the eigenspectrum of the Laplacian data matrix, this pdf distance is shown to measure the cosine of the angle between cluster mean vectors. The Laplacian data matrix, and hence its eigenspectrum, can be obtained automatically based on the data at hand, by optimal Parzen window selection. We show that the Laplacian pdf distance has an interesting interpretation as a risk function connected to the probability of error.
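The geometric quantity named in the abstract, the cosine of the angle between cluster mean vectors in a kernel feature space, can be evaluated without an explicit feature map by averaging kernel values. The following is a minimal sketch under stated assumptions, not the paper's method: it uses a plain Gaussian kernel with a hand-picked width `sigma`, whereas the paper works in the feature space defined by the eigenspectrum of the Laplacian data matrix and selects the Parzen window automatically. The function names and the toy data are hypothetical.

```python
import numpy as np


def gaussian_kernel_matrix(X, Y, sigma):
    """Gaussian kernel evaluated between all rows of X and all rows of Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Guard against tiny negative values caused by floating-point rounding.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma**2))


def cluster_mean_cosine(X1, X2, sigma=1.0):
    """Cosine of the angle between the feature-space means of two clusters.

    Kernel trick: <m1, m2> is the average of k(x_i, x_j) over all pairs
    with x_i in cluster 1 and x_j in cluster 2.
    Assumption: plain Gaussian kernel, not the Laplacian-based feature space.
    """
    k12 = gaussian_kernel_matrix(X1, X2, sigma).mean()
    k11 = gaussian_kernel_matrix(X1, X1, sigma).mean()
    k22 = gaussian_kernel_matrix(X2, X2, sigma).mean()
    return k12 / np.sqrt(k11 * k22)


# Hypothetical toy data: one reference cluster, one nearby, one far away.
rng = np.random.default_rng(0)
a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
b_near = rng.normal(loc=0.5, scale=0.5, size=(50, 2))
b_far = rng.normal(loc=5.0, scale=0.5, size=(50, 2))

cos_same = cluster_mean_cosine(a, a)
cos_near = cluster_mean_cosine(a, b_near)
cos_far = cluster_mean_cosine(a, b_far)
print(cos_same, cos_near, cos_far)
```

In this sketch, well-separated clusters give a cosine near zero and overlapping clusters give a cosine near one, so a clustering cost would favor partitions with a small cosine between cluster means.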
Similar resources
Semi-Supervised Composite Kernel Learning Using Distance Metric Learning Techniques
The distance metric plays a key role in many machine learning and computer vision algorithms, so choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
Discussion of "Spectral Dimensionality Reduction via Maximum Entropy"
Since the introduction of LLE (Roweis and Saul, 2000) and Isomap (Tenenbaum et al., 2000), a large number of non-linear dimensionality reduction techniques (manifold learners) have been proposed. Many of these non-linear techniques can be viewed as instantiations of Kernel PCA; they employ a cleverly designed kernel matrix that preserves local data structure in the “feature space” (Bengio et al...
On the Existence of Kernel Function for Kernel-Trick of k-Means
This paper corrects the proof of Theorem 2 from Gower’s paper [3, page 5]. The correction is needed in order to establish the existence of the kernel function commonly used in the kernel trick, e.g. for the k-means clustering algorithm, on the grounds of a distance matrix. The scope of the correction is explained in Section 2. 1 The background problem Kernel based k-means clustering algorithm (clu...
A Geometry Preserving Kernel over Riemannian Manifolds
The kernel trick and projection to tangent spaces are two choices for linearizing data points lying on Riemannian manifolds. These approaches are used to provide the prerequisites for applying standard machine learning methods on Riemannian manifolds. Classical kernels implicitly project data to a high-dimensional feature space without considering the intrinsic geometry of the data points. ...